[GLUTEN-12071][VL] Respect HadoopConf write options in Velox native Parquet writer #12072

Merged: FelixYBW merged 3 commits into apache:main from wecharyu:GLUTEN-12071 on May 15, 2026

@wecharyu wecharyu commented May 11, 2026

What changes are proposed in this pull request?

This PR updates Velox native Parquet write parameter generation to build write options from:

session.sessionState.newHadoopConfWithOptions(write.options)

This matches Spark's write path in InsertIntoHadoopFsRelationCommand, where Spark merges session HadoopConf and explicit write options before constructing the write job.
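The precedence described above can be sketched with plain Scala maps standing in for Hadoop's `Configuration`. This is a minimal illustration, not the code from this PR: `mergeWriteConf` is a hypothetical helper, the Parquet keys and values are illustrative, and the filtering of null values and of the `path`/`paths` keys is an assumption about Spark's behavior that should be checked against the Spark version in use.

```scala
// Illustrative only: plain Maps stand in for Hadoop's Configuration.
object HadoopConfMergeSketch {

  // Hypothetical helper, not part of Spark or Gluten.
  def mergeWriteConf(
      sessionHadoopConf: Map[String, String],
      writeOptions: Map[String, String]): Map[String, String] = {
    // Assumption: null values and the "path"/"paths" keys are not copied over.
    val applicable = writeOptions.filter { case (k, v) =>
      v != null && k != "path" && k != "paths"
    }
    // On key collisions the right-hand side wins, so explicit write
    // options override values inherited from the session.
    sessionHadoopConf ++ applicable
  }

  def main(args: Array[String]): Unit = {
    // Session-level values (illustrative keys and sizes).
    val sessionConf = Map(
      "parquet.block.size" -> "134217728",
      "parquet.page.size" -> "1048576")

    // Options passed explicitly on the write (illustrative).
    val writeOptions = Map(
      "parquet.block.size" -> "67108864",
      "path" -> "/tmp/out")

    val merged = mergeWriteConf(sessionConf, writeOptions)
    merged.toSeq.sorted.foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

In this sketch the write-level `parquet.block.size` replaces the session-level value, `parquet.page.size` is inherited unchanged, and `path` is dropped. Whether the Velox writer honors a given key is a separate question from whether the key reaches the merged configuration.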

Fix #12071

How was this patch tested?

Added a new unit test, run with:

build/mvn test -Dtest=none -Pspark-4.1 -Pscala-2.13 -Pjava-17 -Pbackends-velox -Pspark-ut -DargLine="-Dspark.test.home=/opt/shims/spark41/spark_home/ -Dspark.testing=true" -pl backends-velox -Dsuites="org.apache.spark.sql.execution.VeloxParquetWriteHadoopConfSuite"

Was this patch authored or co-authored using generative AI tooling?

Codex GPT-5.5

@github-actions github-actions Bot added the VELOX label May 11, 2026
@github-actions github-actions Bot added the CORE works for Gluten Core label May 12, 2026
@github-actions

Run Gluten Clickhouse CI on x86

@wecharyu
Contributor Author

@FelixYBW @zhouyuan could you take a look at this PR? Thanks!

@FelixYBW
Contributor

Only a few options are supported in Velox today:
https://github.com/apache/gluten/blob/main/docs/velox-parquet-write-configuration.md

@FelixYBW FelixYBW merged commit 3bf0942 into apache:main May 15, 2026
60 checks passed

Linked issue: Gluten native Parquet writer ignores spark.hadoop Parquet write configs